below are the necessary libraries to use for this course

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidycensus)
library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows
options(scipen=999)
options(tigris_class = "sf")

# the below line was changed from the code provided in the book. the operator giver was "=", however that appeared to not save "root.dir" to the environment, so I chaged the operator to "<-"

root.dir <- "https://raw.githubusercontent.com/urbanSpatial/Public-Policy-Analytics-Landing/master/DATA/"
source("https://raw.githubusercontent.com/urbanSpatial/Public-Policy-Analytics-Landing/master/functions.r")

Here is the color palette we are going to use for maps

palette5 <- c("#f0f9e8","#bae4bc","#7bccc4","#43a2ca","#0868ac")

how to install my census api key

census_api_key("c08e727b3d32b823955aea424744c9dda3b30df0", install = TRUE, overwrite = TRUE)
## Your original .Renviron will be backed up and stored in your R HOME directory if needed.
## Your API key has been stored in your .Renviron and can be accessed by Sys.getenv("CENSUS_API_KEY"). 
## To use now, restart R or run `readRenviron("~/.Renviron")`
## [1] "c08e727b3d32b823955aea424744c9dda3b30df0"
# readRenviron("~/.Renviron") is for some reason needed for the api to be active in the current session 
readRenviron("~/.Renviron")

we will be using the following data

from the textbook: In this chapter, the tidycensus is used to query and return Census data directly into R. At the time of publication however, the Census has disabled the year 2000 endpoint/data.10 Instead, the code below downloads an equivalent data frame called tracts00. A data frame is the most common way to store data in R. The code block at the end of this section demonstrates a tidycensus call to return 2017 data.

tracts00 <- 
  st_read(file.path(root.dir,"/Chapter1/PHL_CT00.geojson")) %>% 
    st_transform('ESRI:102728')
## Reading layer `PHL_CT00' from data source 
##   `https://raw.githubusercontent.com/urbanSpatial/Public-Policy-Analytics-Landing/master/DATA///Chapter1/PHL_CT00.geojson' 
##   using driver `GeoJSON'
## Simple feature collection with 2667 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -75.28023 ymin: 39.86591 xmax: -74.95551 ymax: 40.13771
## Geodetic CRS:  WGS 84
tracts00[1:3,]
## Simple feature collection with 3 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2696968 ymin: -68663.64 xmax: 2700290 ymax: -65640.25
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
##         GEOID           NAME  variable value                       geometry
## 1 42101000100 Census Tract 1   P001001  2576 MULTIPOLYGON (((2697812 -65...
## 2 42101000100 Census Tract 1   P006002  2095 MULTIPOLYGON (((2697812 -65...
## 3 42101000100 Census Tract 1 PCT025050    48 MULTIPOLYGON (((2697812 -65...

to find out how many census plots we have in our geography (Philadelphia), we can create a table the displays how many rows we have by variable. our data is long formatted so there are multiple rows representing a given census plot.

table(tracts00$variable)
## 
##   H056001   P001001   P006002   P053001   P092001 PCT025009 PCT025050 
##       381       381       381       381       381       381       381

now we are going to plot population by census plot in the year 2000. To do that, we are going to create a variable the has filtered for only rows where the variable is that representing population in the year 2000.

totalpop00 <- 
    tracts00 %>%
    filter(variable == "P001001")
#lets make sure we have the correct number of rows (381)
nrow(totalpop00)    
## [1] 381
# lets see a snapshot of our data
head(totalpop00)
## Simple feature collection with 6 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2688965 ymin: -68894.43 xmax: 2700290 ymax: -64443.03
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
##         GEOID           NAME variable value                       geometry
## 1 42101000100 Census Tract 1  P001001  2576 MULTIPOLYGON (((2697812 -65...
## 2 42101000200 Census Tract 2  P001001  1355 MULTIPOLYGON (((2696038 -65...
## 3 42101000300 Census Tract 3  P001001  2577 MULTIPOLYGON (((2690929 -64...
## 4 42101000400 Census Tract 4  P001001  4387 MULTIPOLYGON (((2693009 -66...
## 5 42101000500 Census Tract 5  P001001  1071 MULTIPOLYGON (((2697206 -67...
## 6 42101000600 Census Tract 6  P001001  1384 MULTIPOLYGON (((2696968 -68...

lets take a look at what the basic “plot function does for an {sf} layer

plot(totalpop00)

the textbook illustrates how to make progressively more useful maps using the ggplot function with geom_sf layers

from the textbook: A maps totalPop00, using the aes or aesthetic parameter to fill the tract polygons by the value field.

Plot B converts value to 5 quintile categories using the q5 function. These 5 categories are of class factor. Try q5(totalPop00$value).

Plot C adds fill color and legend improvements using the scale_fill_manual function. Many different scale types are possible.13 values is set to a list of colors, palette5. labels is set to the values of the associated quintiles (try qBr(totalPop00, “value”)). Finally, a legend name is added. inserts a hard return.

Plot D inserts a title using the labs parameter, as well as a mapTheme

Plot_A <- 
  ggplot() +
  geom_sf(data = totalpop00, aes(fill = value))

Plot_B <- 
  ggplot() +
  geom_sf(data = totalpop00, aes(fill = q5(value)))

Plot_C <-
  ggplot() +
  geom_sf(data = totalpop00, aes(fill = q5(value))) +
  scale_fill_manual(values = palette5,
                    labels = qBr(totalpop00, "value"),
                    name = "Total\nPopluation\n(Quintile Breaks)")

Plot_D <- 
  ggplot() +
  geom_sf(data = totalpop00, aes(fill = q5(value))) +
  scale_fill_manual(values = palette5,
                    labels = qBr(totalpop00, "value"),
                    name = "Popluation\n(Quintile Breaks)") +
  labs(title = "Total Population", subtitle = "Philadelphia; 2000") +
  mapTheme()
## Warning: The `size` argument of `element_rect()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

the above code saves our plots to their assigned variables. below we can print them.

print(Plot_A)

print(Plot_B)

print(Plot_C)

print(Plot_D)

next, the textbook converts the data from long format to wide format and in general makes it easier to read.

tracts00 <- 
  dplyr::select(tracts00, -NAME) %>%
    spread(variable, value) %>%
    dplyr::select(-geometry) %>%
    rename(TotalPop = P001001, Whites = P006002, MaleBachelors = PCT025009,
           FemaleBachelors = PCT025050, MedHHInc = P053001, MedRent = H056001,
           TotalPoverty = P092001) 

st_drop_geometry(tracts00)[1:3,]
##         GEOID MedRent TotalPop Whites MedHHInc TotalPoverty MaleBachelors
## 1 42101000100     858     2576   2095    48886         1739            64
## 2 42101000200     339     1355    176     8349          505            23
## 3 42101000300     660     2577   1893    40625         1189            41
##   FemaleBachelors
## 1              48
## 2              73
## 3             103

lets change our data to wide format

tracts00 <- 
  tracts00 %>%
  mutate(pctWhite = ifelse(TotalPop > 0, Whites / TotalPop, 0),
         pctBachelors = ifelse(TotalPop > 0, ((FemaleBachelors + MaleBachelors) / TotalPop), 0),
         pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
         year = "2000") %>%
  dplyr::select(-Whites,-FemaleBachelors,-MaleBachelors,-TotalPoverty)

now we are going to query 2017 acs data to match our 2000 data. I am going to change the code from the textbook so that we can see the data change which each line of code.

tracts17 <- 
  get_acs(geography = "tract", variables = c("B25026_001E","B02001_002E","B15001_050E",
                                             "B15001_009E","B19013_001E","B25058_001E",
                                             "B06012_002E"), 
          year=2017, state=42, county=101, geometry=T, output="wide") %>%
    #the above code specifies the query from the census API, note that more than just the specified variables are generated as columns. I'm not sure what the additional variables represent.
    print() %>%
  st_transform('ESRI:102728') %>%
    # changes the CRS to what we have been using
    print() %>%
  rename(TotalPop = B25026_001E, Whites = B02001_002E,
         FemaleBachelors = B15001_050E, MaleBachelors = B15001_009E,
         MedHHInc = B19013_001E, MedRent = B25058_001E,
         TotalPoverty = B06012_002E) %>%
    #names our columns 
    print() %>%
  dplyr::select(-NAME, -starts_with("B")) %>%
    #removes names and all the columns that start with "b". all the variable codes start with b, so any columns that we didnt name get deleted
    print() %>%
  mutate(pctWhite = ifelse(TotalPop > 0, Whites / TotalPop,0),
         pctBachelors = ifelse(TotalPop > 0, ((FemaleBachelors + MaleBachelors) / TotalPop),0),
         pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
         year = "2017") %>%
    #calculates our statistics of interest as a rate instead of a count. we use ifelse functions to account for cases in which we have 0 numerator or denominator. that could be a result of incomplete data.
    print() %>%
  dplyr::select(-Whites, -FemaleBachelors, -MaleBachelors, -TotalPoverty) 
## Getting data from the 2013-2017 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=                                                                     |   1%
  |                                                                            
  |==                                                                    |   2%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |===========                                                           |  15%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |=============                                                         |  18%
  |                                                                            
  |==============                                                        |  21%
  |                                                                            
  |===============                                                       |  22%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  27%
  |                                                                            
  |====================                                                  |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |========================                                              |  35%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |===========================                                           |  39%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |=======================================                               |  55%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |========================================================              |  81%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================| 100%
## Simple feature collection with 384 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -75.28027 ymin: 39.86705 xmax: -74.95578 ymax: 40.13799
## Geodetic CRS:  NAD83
## First 10 features:
##          GEOID                                                  NAME
## 1  42101000200     Census Tract 2, Philadelphia County, Pennsylvania
## 2  42101000500     Census Tract 5, Philadelphia County, Pennsylvania
## 3  42101000901  Census Tract 9.01, Philadelphia County, Pennsylvania
## 4  42101001002 Census Tract 10.02, Philadelphia County, Pennsylvania
## 5  42101002000    Census Tract 20, Philadelphia County, Pennsylvania
## 6  42101003901 Census Tract 39.01, Philadelphia County, Pennsylvania
## 7  42101007102 Census Tract 71.02, Philadelphia County, Pennsylvania
## 8  42101008701 Census Tract 87.01, Philadelphia County, Pennsylvania
## 9  42101010200   Census Tract 102, Philadelphia County, Pennsylvania
## 10 42101011500   Census Tract 115, Philadelphia County, Pennsylvania
##    B25026_001E B25026_001M B02001_002E B02001_002M B15001_050E B15001_050M
## 1         2479         424         698         196         145          90
## 2         1375         124        1096         292          61          27
## 3         1949         179        1452         209          96          49
## 4         3259         264        3056         251          26          23
## 5         2154         222         534         133           6           7
## 6         6583         810        4597         840         124          91
## 7         4944         740         174         127           7          10
## 8         4356         499        2194         358         206         101
## 9         3311         481           0          10           0          10
## 10        4686         431         189          95           0          10
##    B15001_009E B15001_009M B19013_001E B19013_001M B25058_001E B25058_001M
## 1          113          62       30746        8706        1051         128
## 2           55          45       42135       12885        1363         174
## 3           39          30       42548        9262        1095          62
## 4           15          14      109020       12211        1598         103
## 5           39          40       41335        6339         841          47
## 6           40          47       38301        3245         774          49
## 7            0          10       31332        6631         708         101
## 8          251         134       44225        4124         908          52
## 9            0          10       18989        5279         486          79
## 10          18          28       51155        4231         788          28
##    B06012_002E B06012_002M                       geometry
## 1          962         378 MULTIPOLYGON (((-75.16234 3...
## 2          536         107 MULTIPOLYGON (((-75.16506 3...
## 3          473         131 MULTIPOLYGON (((-75.16484 3...
## 4          211         163 MULTIPOLYGON (((-75.15034 3...
## 5          365         139 MULTIPOLYGON (((-75.18824 3...
## 6         1423         504 MULTIPOLYGON (((-75.17691 3...
## 7         1715         610 MULTIPOLYGON (((-75.23363 3...
## 8         1094         320 MULTIPOLYGON (((-75.21272 3...
## 9         1337         469 MULTIPOLYGON (((-75.23536 3...
## 10         739         315 MULTIPOLYGON (((-75.25592 3...
## Simple feature collection with 384 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2660774 ymin: -98728.57 xmax: 2750287 ymax: 1626.778
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
## First 10 features:
##          GEOID                                                  NAME
## 1  42101000200     Census Tract 2, Philadelphia County, Pennsylvania
## 2  42101000500     Census Tract 5, Philadelphia County, Pennsylvania
## 3  42101000901  Census Tract 9.01, Philadelphia County, Pennsylvania
## 4  42101001002 Census Tract 10.02, Philadelphia County, Pennsylvania
## 5  42101002000    Census Tract 20, Philadelphia County, Pennsylvania
## 6  42101003901 Census Tract 39.01, Philadelphia County, Pennsylvania
## 7  42101007102 Census Tract 71.02, Philadelphia County, Pennsylvania
## 8  42101008701 Census Tract 87.01, Philadelphia County, Pennsylvania
## 9  42101010200   Census Tract 102, Philadelphia County, Pennsylvania
## 10 42101011500   Census Tract 115, Philadelphia County, Pennsylvania
##    B25026_001E B25026_001M B02001_002E B02001_002M B15001_050E B15001_050M
## 1         2479         424         698         196         145          90
## 2         1375         124        1096         292          61          27
## 3         1949         179        1452         209          96          49
## 4         3259         264        3056         251          26          23
## 5         2154         222         534         133           6           7
## 6         6583         810        4597         840         124          91
## 7         4944         740         174         127           7          10
## 8         4356         499        2194         358         206         101
## 9         3311         481           0          10           0          10
## 10        4686         431         189          95           0          10
##    B15001_009E B15001_009M B19013_001E B19013_001M B25058_001E B25058_001M
## 1          113          62       30746        8706        1051         128
## 2           55          45       42135       12885        1363         174
## 3           39          30       42548        9262        1095          62
## 4           15          14      109020       12211        1598         103
## 5           39          40       41335        6339         841          47
## 6           40          47       38301        3245         774          49
## 7            0          10       31332        6631         708         101
## 8          251         134       44225        4124         908          52
## 9            0          10       18989        5279         486          79
## 10          18          28       51155        4231         788          28
##    B06012_002E B06012_002M                       geometry
## 1          962         378 MULTIPOLYGON (((2694006 -65...
## 2          536         107 MULTIPOLYGON (((2693289 -66...
## 3          473         131 MULTIPOLYGON (((2693422 -69...
## 4          211         163 MULTIPOLYGON (((2697524 -70...
## 5          365         139 MULTIPOLYGON (((2686940 -72...
## 6         1423         504 MULTIPOLYGON (((2690317 -78...
## 7         1715         610 MULTIPOLYGON (((2674209 -72...
## 8         1094         320 MULTIPOLYGON (((2679966 -68...
## 9         1337         469 MULTIPOLYGON (((2673427 -61...
## 10         739         315 MULTIPOLYGON (((2667564 -58...
## Simple feature collection with 384 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2660774 ymin: -98728.57 xmax: 2750287 ymax: 1626.778
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
## First 10 features:
##          GEOID                                                  NAME TotalPop
## 1  42101000200     Census Tract 2, Philadelphia County, Pennsylvania     2479
## 2  42101000500     Census Tract 5, Philadelphia County, Pennsylvania     1375
## 3  42101000901  Census Tract 9.01, Philadelphia County, Pennsylvania     1949
## 4  42101001002 Census Tract 10.02, Philadelphia County, Pennsylvania     3259
## 5  42101002000    Census Tract 20, Philadelphia County, Pennsylvania     2154
## 6  42101003901 Census Tract 39.01, Philadelphia County, Pennsylvania     6583
## 7  42101007102 Census Tract 71.02, Philadelphia County, Pennsylvania     4944
## 8  42101008701 Census Tract 87.01, Philadelphia County, Pennsylvania     4356
## 9  42101010200   Census Tract 102, Philadelphia County, Pennsylvania     3311
## 10 42101011500   Census Tract 115, Philadelphia County, Pennsylvania     4686
##    B25026_001M Whites B02001_002M FemaleBachelors B15001_050M MaleBachelors
## 1          424    698         196             145          90           113
## 2          124   1096         292              61          27            55
## 3          179   1452         209              96          49            39
## 4          264   3056         251              26          23            15
## 5          222    534         133               6           7            39
## 6          810   4597         840             124          91            40
## 7          740    174         127               7          10             0
## 8          499   2194         358             206         101           251
## 9          481      0          10               0          10             0
## 10         431    189          95               0          10            18
##    B15001_009M MedHHInc B19013_001M MedRent B25058_001M TotalPoverty
## 1           62    30746        8706    1051         128          962
## 2           45    42135       12885    1363         174          536
## 3           30    42548        9262    1095          62          473
## 4           14   109020       12211    1598         103          211
## 5           40    41335        6339     841          47          365
## 6           47    38301        3245     774          49         1423
## 7           10    31332        6631     708         101         1715
## 8          134    44225        4124     908          52         1094
## 9           10    18989        5279     486          79         1337
## 10          28    51155        4231     788          28          739
##    B06012_002M                       geometry
## 1          378 MULTIPOLYGON (((2694006 -65...
## 2          107 MULTIPOLYGON (((2693289 -66...
## 3          131 MULTIPOLYGON (((2693422 -69...
## 4          163 MULTIPOLYGON (((2697524 -70...
## 5          139 MULTIPOLYGON (((2686940 -72...
## 6          504 MULTIPOLYGON (((2690317 -78...
## 7          610 MULTIPOLYGON (((2674209 -72...
## 8          320 MULTIPOLYGON (((2679966 -68...
## 9          469 MULTIPOLYGON (((2673427 -61...
## 10         315 MULTIPOLYGON (((2667564 -58...
## Simple feature collection with 384 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2660774 ymin: -98728.57 xmax: 2750287 ymax: 1626.778
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
## First 10 features:
##          GEOID TotalPop Whites FemaleBachelors MaleBachelors MedHHInc MedRent
## 1  42101000200     2479    698             145           113    30746    1051
## 2  42101000500     1375   1096              61            55    42135    1363
## 3  42101000901     1949   1452              96            39    42548    1095
## 4  42101001002     3259   3056              26            15   109020    1598
## 5  42101002000     2154    534               6            39    41335     841
## 6  42101003901     6583   4597             124            40    38301     774
## 7  42101007102     4944    174               7             0    31332     708
## 8  42101008701     4356   2194             206           251    44225     908
## 9  42101010200     3311      0               0             0    18989     486
## 10 42101011500     4686    189               0            18    51155     788
##    TotalPoverty                       geometry
## 1           962 MULTIPOLYGON (((2694006 -65...
## 2           536 MULTIPOLYGON (((2693289 -66...
## 3           473 MULTIPOLYGON (((2693422 -69...
## 4           211 MULTIPOLYGON (((2697524 -70...
## 5           365 MULTIPOLYGON (((2686940 -72...
## 6          1423 MULTIPOLYGON (((2690317 -78...
## 7          1715 MULTIPOLYGON (((2674209 -72...
## 8          1094 MULTIPOLYGON (((2679966 -68...
## 9          1337 MULTIPOLYGON (((2673427 -61...
## 10          739 MULTIPOLYGON (((2667564 -58...
## Simple feature collection with 384 features and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2660774 ymin: -98728.57 xmax: 2750287 ymax: 1626.778
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
## First 10 features:
##          GEOID TotalPop Whites FemaleBachelors MaleBachelors MedHHInc MedRent
## 1  42101000200     2479    698             145           113    30746    1051
## 2  42101000500     1375   1096              61            55    42135    1363
## 3  42101000901     1949   1452              96            39    42548    1095
## 4  42101001002     3259   3056              26            15   109020    1598
## 5  42101002000     2154    534               6            39    41335     841
## 6  42101003901     6583   4597             124            40    38301     774
## 7  42101007102     4944    174               7             0    31332     708
## 8  42101008701     4356   2194             206           251    44225     908
## 9  42101010200     3311      0               0             0    18989     486
## 10 42101011500     4686    189               0            18    51155     788
##    TotalPoverty                       geometry   pctWhite pctBachelors
## 1           962 MULTIPOLYGON (((2694006 -65... 0.28156515  0.104074223
## 2           536 MULTIPOLYGON (((2693289 -66... 0.79709091  0.084363636
## 3           473 MULTIPOLYGON (((2693422 -69... 0.74499743  0.069266290
## 4           211 MULTIPOLYGON (((2697524 -70... 0.93771095  0.012580546
## 5           365 MULTIPOLYGON (((2686940 -72... 0.24791086  0.020891365
## 6          1423 MULTIPOLYGON (((2690317 -78... 0.69831384  0.024912654
## 7          1715 MULTIPOLYGON (((2674209 -72... 0.03519417  0.001415858
## 8          1094 MULTIPOLYGON (((2679966 -68... 0.50367309  0.104912764
## 9          1337 MULTIPOLYGON (((2673427 -61... 0.00000000  0.000000000
## 10          739 MULTIPOLYGON (((2667564 -58... 0.04033291  0.003841229
##    pctPoverty year
## 1  0.38805970 2017
## 2  0.38981818 2017
## 3  0.24268856 2017
## 4  0.06474379 2017
## 5  0.16945218 2017
## 6  0.21616284 2017
## 7  0.34688511 2017
## 8  0.25114784 2017
## 9  0.40380550 2017
## 10 0.15770380 2017
    #removes the selected columns. the - operator specifies that we want everything besides the specified columns
head(tracts17)
## Simple feature collection with 6 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2686791 ymin: -79202.01 xmax: 2699682 ymax: -65257.88
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
##         GEOID TotalPop MedHHInc MedRent  pctWhite pctBachelors pctPoverty year
## 1 42101000200     2479    30746    1051 0.2815651   0.10407422 0.38805970 2017
## 2 42101000500     1375    42135    1363 0.7970909   0.08436364 0.38981818 2017
## 3 42101000901     1949    42548    1095 0.7449974   0.06926629 0.24268856 2017
## 4 42101001002     3259   109020    1598 0.9377110   0.01258055 0.06474379 2017
## 5 42101002000     2154    41335     841 0.2479109   0.02089136 0.16945218 2017
## 6 42101003901     6583    38301     774 0.6983138   0.02491265 0.21616284 2017
##                         geometry
## 1 MULTIPOLYGON (((2694006 -65...
## 2 MULTIPOLYGON (((2693289 -66...
## 3 MULTIPOLYGON (((2693422 -69...
## 4 MULTIPOLYGON (((2697524 -70...
## 5 MULTIPOLYGON (((2686940 -72...
## 6 MULTIPOLYGON (((2690317 -78...

now we are going to combine our two data sets, one from 2000, one from 2017, together

allTracts <- rbind(tracts00,tracts17)

philly transit open data

from the textbook: The code below downloads and binds together El and Broad St. station locations into a single layer, septaStops. st_read downloads the data in geojson form (with geometries) from the web. A Line field is generated and selected along with the Station field. Lastly, the data is projected into the same coordinate system as tracts00.

septaStops <- 
  rbind(
    st_read("https://opendata.arcgis.com/datasets/8c6e2575c8ad46eb887e6bb35825e1a6_0.geojson") %>% 
    mutate(Line = "El") %>% #this line adds the label 'El' to a newly created "Line" Column 
    select(Station, Line),
      st_read("https://opendata.arcgis.com/datasets/2e9037fd5bef406488ffe5bb67d21312_0.geojson") %>%
      mutate(Line ="Broad_St") %>%
      select(Station, Line)) %>%
   st_transform(st_crs(tracts00))  #note how you can call the crs from the tracts00 dataframe using str_crs
## Reading layer `SEPTA_-_Market-Frankford_Line_Stations' from data source 
##   `https://opendata.arcgis.com/datasets/8c6e2575c8ad46eb887e6bb35825e1a6_0.geojson' 
##   using driver `GeoJSON'
## Simple feature collection with 28 features and 45 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -75.25973 ymin: 39.94989 xmax: -75.07788 ymax: 40.02299
## Geodetic CRS:  WGS 84
## Reading layer `SEPTA_-_Broad_Street_Line_Stations' from data source 
##   `https://opendata.arcgis.com/datasets/2e9037fd5bef406488ffe5bb67d21312_0.geojson' 
##   using driver `GeoJSON'
## Simple feature collection with 24 features and 45 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -75.17394 ymin: 39.90543 xmax: -75.13679 ymax: 40.04185
## Geodetic CRS:  WGS 84

lets map our stops onto the city

ggplot() + #creates the plot
  geom_sf(data=st_union(tracts00)) + #adds the dissolved tract boundaries city map as a layer
  geom_sf(data=septaStops, aes(colour = Line), show.legend = "point", size= 2) + # adds the septaStops as a layer to the map with different colors for each of the values in the Line column
  scale_colour_manual(values = c("orange","blue")) +
  labs(title="Septa Stops", subtitle="Philadelphia, PA") +
  mapTheme()

now we are going to create “buffer” areas around each stop to be considered as transit oriented areas

septaBuffers <- 
  rbind(
    st_buffer(septaStops, 2640) %>% # creates the buffers of 2640 feet (1/2 mile).
      mutate(Legend = "Buffer") %>% # creates legend column with a buffer label for each stop buffer
      dplyr::select(Legend), #selects only the legend column along with the geometry to remain in the dataframe
    st_union(st_buffer(septaStops, 2640)) %>% 
      st_sf() %>% #this function is for some reason needed to convert the union transit areas created with the last line into an sf layer
      mutate(Legend = "Unioned Buffer"))

ggplot() +
  geom_sf(data=septaBuffers) +
  geom_sf(data=septaStops, show.legend = "point") +
  facet_wrap(~Legend) + #facet wrap can only be performed on long data
  mapTheme()

now we need to select which tracts constitute part of the trasit oriented area. there are three methods that the book discusses for selecting tracts. The first is clipping, which selects the interesections of the tract and the buffer. This is problematic because the data for each tract aligns with the whole tract and its unclear how the clipping method divides the other variables if at all after subtracting area.

buffer <- filter(septaBuffers, Legend=="Unioned Buffer")

clip <- 
  st_intersection(buffer, tracts00) %>%
    dplyr::select(TotalPop) %>%
    mutate(Selection_Type = "Clip")
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries

the next method is to select any geometries that intersect with the buffer

selection <- 
  tracts00[buffer,] %>%
    dplyr::select(TotalPop) %>%
    mutate(Selection_Type = "Spatial Selection")

and the last method is to choose all tracts of which’s “center of gravity” fall within the buffer. you imigane the ceter of gravity as at what point you would need to put your finger to balance the shape atop of it. This is the preffered method of the author.

selectCentroids <-
  st_centroid(tracts00)[buffer,] %>% #creates centroid point geometry for tracts whose centroid is within the buffer
print() %>%
    st_drop_geometry() %>% #this converts the sf into a dataframe to allow for it to be joined back with the original sf. I think that this is necessary because there cannot be more than one geometry per row in an sf
print() %>%
    left_join(dplyr::select(tracts00, GEOID)) %>% #this joins the two data frames by the unique geoids. so it seems that since our centroid frame is the first variable, we get only matches that exist from that frame. and what we are matching it to is only the GEOID from tracts00. since tracts is a sf it seems like you are unable to select away the geometry
print() %>%
    st_sf() %>% # converts dataframe back to sf
    dplyr::select(TotalPop) %>% #for some reason we only want total population
    mutate(Selection_Type = "Select by Centroids") #creates the select by centroids value in the Selection_type column. this will be useful for the side by side map we are going to create
## Warning: st_centroid assumes attributes are constant over geometries
## Simple feature collection with 97 features and 8 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 2669191 ymin: -83405.65 xmax: 2716999 ymax: -34081.94
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
## First 10 features:
##          GEOID MedRent TotalPop MedHHInc  pctWhite pctBachelors pctPoverty year
## 1  42101000100     858     2576    48886 0.8132764  0.043478261  0.6750776 2000
## 2  42101000200     339     1355     8349 0.1298893  0.070848708  0.3726937 2000
## 3  42101000300     660     2577    40625 0.7345751  0.055878929  0.4613892 2000
## 4  42101000400     601     4387    27400 0.7611124  0.028949168  0.6856622 2000
## 5  42101000500     501     1071     9620 0.3968254  0.006535948  0.2978525 2000
## 6  42101000600     913     1384    41563 0.7037572  0.067919075  0.4017341 2000
## 7  42101000700     844     2550    34536 0.8333333  0.075686275  0.7101961 2000
## 8  42101000800     904     8461    42346 0.8419809  0.066422409  0.7229642 2000
## 9  42101000900     575     4969    20725 0.7037633  0.146508352  0.6415778 2000
## 10 42101001000     873     5808    72625 0.9133953  0.011019284  0.6234504 2000
##                     geometry
## 1  POINT (2698593 -67134.68)
## 2  POINT (2695637 -66133.72)
## 3  POINT (2691672 -65624.79)
## 4   POINT (2691447 -66938.9)
## 5  POINT (2695339 -67444.62)
## 6  POINT (2695341 -68393.71)
## 7  POINT (2691290 -67859.72)
## 8  POINT (2691053 -68560.58)
## 9  POINT (2695146 -69101.59)
## 10 POINT (2698155 -69801.43)
##           GEOID MedRent TotalPop MedHHInc    pctWhite pctBachelors pctPoverty
## 1   42101000100     858     2576    48886 0.813276398 0.0434782609 0.67507764
## 2   42101000200     339     1355     8349 0.129889299 0.0708487085 0.37269373
## 3   42101000300     660     2577    40625 0.734575087 0.0558789290 0.46138921
## 4   42101000400     601     4387    27400 0.761112377 0.0289491680 0.68566218
## 5   42101000500     501     1071     9620 0.396825397 0.0065359477 0.29785247
## 6   42101000600     913     1384    41563 0.703757225 0.0679190751 0.40173410
## 7   42101000700     844     2550    34536 0.833333333 0.0756862745 0.71019608
## 8   42101000800     904     8461    42346 0.841980853 0.0664224087 0.72296419
## 9   42101000900     575     4969    20725 0.703763333 0.1465083518 0.64157778
## 10  42101001000     873     5808    72625 0.913395317 0.0110192837 0.62345041
## 11  42101001100     652     5925    36564 0.797468354 0.0494514768 0.65029536
## 14  42101001400     450     3528    26897 0.320011338 0.0229591837 0.52636054
## 15  42101001500     622     2192    38026 0.598540146 0.0369525547 0.53102190
## 18  42101001800     514     2462    36458 0.689277011 0.0170593014 0.51056052
## 19  42101001900     422     2689    21766 0.024172555 0.0081814801 0.38750465
## 22  42101002200     487     2135    30156 0.121779859 0.0051522248 0.41358314
## 23  42101002300     425     2644    29806 0.660363086 0.0230711044 0.43154312
## 24  42101002400     474     4262    34247 0.634209291 0.0229938996 0.47536368
## 28  42101002800     427     9255    22759 0.613182064 0.0143706105 0.39535386
## 29  42101002900     488     4050    26744 0.855308642 0.0214814815 0.48790123
## 30  42101003000     410     8160    20294 0.150000000 0.0019607843 0.34595588
## 40  42101003901     463     6192    26347 0.749677003 0.0050064599 0.43782300
## 41  42101003902     468     5630    36472 0.970692718 0.0031971581 0.43570160
## 42  42101004001     457     4036    25854 0.921704658 0.0101585728 0.45936571
## 43  42101004002     484     5024    31045 0.984076433 0.0151273885 0.43829618
## 50  42101004500     520     3049    36335 0.951459495 0.0062315513 0.44375205
## 52  42101004700     622     2540    44080 0.973622047 0.0031496063 0.41141732
## 53  42101004800     638      500    30455 1.000000000 0.0000000000 0.42000000
## 87  42101008301     472     4420    24800 0.020361991 0.0029411765 0.39321267
## 88  42101008302     425     4535    24567 0.005071665 0.0000000000 0.40044101
## 89  42101008400     414     4798    26341 0.015006253 0.0062526053 0.37265527
## 90  42101008500     381     6539    22077 0.028903502 0.0051995718 0.40266096
## 91  42101008600     443     6224    19612 0.154723650 0.0215295630 0.45983290
## 92  42101008700     507     6733    21131 0.457151344 0.0957968216 0.50378732
## 93  42101008800     590     8307     9320 0.584326472 0.1001564945 0.25954015
## 94  42101008900     538     3343     6311 0.561471732 0.1011067903 0.15465151
## 95  42101009000     571     5335    13792 0.621368322 0.0391752577 0.29090909
## 96  42101009100     350     2737    17500 0.299963464 0.0226525393 0.51589331
## 97  42101009200     307     3150    20083 0.026666667 0.0111111111 0.38634921
## 98  42101009300     377     4609    21503 0.002603602 0.0000000000 0.40203949
## 99  42101009400     285     3871    17743 0.009299923 0.0000000000 0.42934642
## 100 42101009500     368     3588    21058 0.000000000 0.0027870680 0.38099220
## 101 42101009600     404     4504    23171 0.017984014 0.0000000000 0.37655417
## 105 42101010000     421     4256    31415 0.336701128 0.0063439850 0.37946429
## 107 42101010200     331     3240    20634 0.008950617 0.0024691358 0.41172840
## 109 42101010400     374     3498    21772 0.010863350 0.0022870212 0.36678102
## 131 42101012600     916      801    31838 0.485642946 0.0399500624 0.40823970
## 132 42101012700     223      689     7500 0.076923077 0.0000000000 0.33671988
## 133 42101012800     841      231    39643 0.831168831 0.0303030303 0.64935065
## 134 42101012900     586      526    41000 0.922053232 0.0000000000 0.54372624
## 135 42101013000     573     1003    35625 0.417746760 0.0368893320 0.50149551
## 137 42101013200     177     2190    14375 0.051598174 0.0000000000 0.31735160
## 138 42101013300     492     2217    29352 0.274695535 0.0211998196 0.46098331
## 145 42101014000     370     2855    13458 0.042732049 0.0122591944 0.41015762
## 146 42101014100     210     2366    12165 0.094674556 0.0000000000 0.41758242
## 147 42101014200     565     2091    30862 0.608321377 0.0162601626 0.45911047
## 149 42101014400     323     3217    23720 0.393534349 0.0027976376 0.35685421
## 151 42101014600     384     3231    26295 0.194986072 0.0000000000 0.30238316
## 152 42101014700     306     2449    16411 0.008166599 0.0000000000 0.31073908
## 158 42101015300     353     3382    14826 0.068598462 0.0165582496 0.34920166
## 159 42101015400     489     2316    31923 0.410621762 0.0000000000 0.07340242
## 161 42101015600     326     1989    14524 0.227249874 0.0045248869 0.29562594
## 162 42101015700     339     2759    22478 0.355201160 0.0050743023 0.32584270
## 163 42101015800     387     5854    31862 0.953877690 0.0015374103 0.38503587
## 166 42101016100     349     5435    19098 0.605887764 0.0000000000 0.32750690
## 167 42101016200     337     2442    13833 0.223177723 0.0000000000 0.32186732
## 168 42101016300     341     3776    15865 0.181938559 0.0052966102 0.28257415
## 170 42101016500     330     2722    16964 0.011756062 0.0022042616 0.33357825
## 171 42101016600     353     1630    14250 0.025766871 0.0055214724 0.39693252
## 172 42101016700     363     6947    16367 0.002303152 0.0061897222 0.37339859
## 179 42101017300     427     3126    21731 0.008637236 0.0000000000 0.36852207
## 180 42101017400     339     2667    15050 0.024371954 0.0029996250 0.34833146
## 182 42101017601     325     5863    11909 0.240661777 0.0010233669 0.30291660
## 184 42101017700     370     9159    15687 0.313789715 0.0005459111 0.28976963
## 185 42101017800     389     6340    14793 0.502996845 0.0037854890 0.33280757
## 195 42101018800     385     7257    22489 0.686371779 0.0000000000 0.34463277
## 196 42101018900     412      942    24032 0.580679406 0.0084925690 0.33757962
## 197 42101019000     427     6869    24650 0.591789198 0.0000000000 0.32581162
## 199 42101019200     390     7519    19375 0.408032983 0.0014629605 0.30110387
## 204 42101019700     422     7139    19877 0.186300602 0.0016809077 0.29079703
## 207 42101020000     378     1756    22786 0.185649203 0.0182232346 0.30410023
## 208 42101020100     349     7732    20436 0.018882566 0.0010346611 0.40597517
## 210 42101020300     371     3147    23277 0.032729584 0.0019065777 0.34413727
## 211 42101020400     386     3398    29814 0.016774573 0.0020600353 0.34873455
## 283 42101027500     472     4823    33953 0.226622434 0.0008293593 0.32821895
## 284 42101027600     463     3932    31316 0.033825025 0.0073753815 0.39267548
## 286 42101027800     426     4867    25690 0.034929114 0.0000000000 0.41956030
## 288 42101028000     485     4819    28281 0.021996265 0.0076779415 0.35692052
## 289 42101028100     401     4051    36129 0.057516663 0.0108615157 0.32189583
## 290 42101028200     457     5448    29142 0.137114537 0.0014684288 0.30231278
## 291 42101028300     425     7270    24725 0.048693260 0.0011004127 0.33301238
## 301 42101029300     375     2898    19205 0.652173913 0.0027605245 0.34817115
## 302 42101029400     386     3232    21288 0.368811881 0.0021658416 0.34096535
## 308 42101030000     388     7150    23876 0.322517483 0.0036363636 0.35356643
## 309 42101030100     427     6050    32224 0.505289256 0.0019834711 0.36082645
## 310 42101030200     411     6694    39072 0.695100090 0.0043322378 0.36241410
## 381 42101036600    1505      618    87027 0.953074434 0.0161812298 0.61650485
##     year
## 1   2000
## 2   2000
## 3   2000
## 4   2000
## 5   2000
## 6   2000
## 7   2000
## 8   2000
## 9   2000
## 10  2000
## 11  2000
## 14  2000
## 15  2000
## 18  2000
## 19  2000
## 22  2000
## 23  2000
## 24  2000
## 28  2000
## 29  2000
## 30  2000
## 40  2000
## 41  2000
## 42  2000
## 43  2000
## 50  2000
## 52  2000
## 53  2000
## 87  2000
## 88  2000
## 89  2000
## 90  2000
## 91  2000
## 92  2000
## 93  2000
## 94  2000
## 95  2000
## 96  2000
## 97  2000
## 98  2000
## 99  2000
## 100 2000
## 101 2000
## 105 2000
## 107 2000
## 109 2000
## 131 2000
## 132 2000
## 133 2000
## 134 2000
## 135 2000
## 137 2000
## 138 2000
## 145 2000
## 146 2000
## 147 2000
## 149 2000
## 151 2000
## 152 2000
## 158 2000
## 159 2000
## 161 2000
## 162 2000
## 163 2000
## 166 2000
## 167 2000
## 168 2000
## 170 2000
## 171 2000
## 172 2000
## 179 2000
## 180 2000
## 182 2000
## 184 2000
## 185 2000
## 195 2000
## 196 2000
## 197 2000
## 199 2000
## 204 2000
## 207 2000
## 208 2000
## 210 2000
## 211 2000
## 283 2000
## 284 2000
## 286 2000
## 288 2000
## 289 2000
## 290 2000
## 291 2000
## 301 2000
## 302 2000
## 308 2000
## 309 2000
## 310 2000
## 381 2000
## Joining with `by = join_by(GEOID)`
##          GEOID MedRent TotalPop MedHHInc    pctWhite pctBachelors pctPoverty
## 1  42101000100     858     2576    48886 0.813276398 0.0434782609 0.67507764
## 2  42101000200     339     1355     8349 0.129889299 0.0708487085 0.37269373
## 3  42101000300     660     2577    40625 0.734575087 0.0558789290 0.46138921
## 4  42101000400     601     4387    27400 0.761112377 0.0289491680 0.68566218
## 5  42101000500     501     1071     9620 0.396825397 0.0065359477 0.29785247
## 6  42101000600     913     1384    41563 0.703757225 0.0679190751 0.40173410
## 7  42101000700     844     2550    34536 0.833333333 0.0756862745 0.71019608
## 8  42101000800     904     8461    42346 0.841980853 0.0664224087 0.72296419
## 9  42101000900     575     4969    20725 0.703763333 0.1465083518 0.64157778
## 10 42101001000     873     5808    72625 0.913395317 0.0110192837 0.62345041
## 11 42101001100     652     5925    36564 0.797468354 0.0494514768 0.65029536
## 12 42101001400     450     3528    26897 0.320011338 0.0229591837 0.52636054
## 13 42101001500     622     2192    38026 0.598540146 0.0369525547 0.53102190
## 14 42101001800     514     2462    36458 0.689277011 0.0170593014 0.51056052
## 15 42101001900     422     2689    21766 0.024172555 0.0081814801 0.38750465
## 16 42101002200     487     2135    30156 0.121779859 0.0051522248 0.41358314
## 17 42101002300     425     2644    29806 0.660363086 0.0230711044 0.43154312
## 18 42101002400     474     4262    34247 0.634209291 0.0229938996 0.47536368
## 19 42101002800     427     9255    22759 0.613182064 0.0143706105 0.39535386
## 20 42101002900     488     4050    26744 0.855308642 0.0214814815 0.48790123
## 21 42101003000     410     8160    20294 0.150000000 0.0019607843 0.34595588
## 22 42101003901     463     6192    26347 0.749677003 0.0050064599 0.43782300
## 23 42101003902     468     5630    36472 0.970692718 0.0031971581 0.43570160
## 24 42101004001     457     4036    25854 0.921704658 0.0101585728 0.45936571
## 25 42101004002     484     5024    31045 0.984076433 0.0151273885 0.43829618
## 26 42101004500     520     3049    36335 0.951459495 0.0062315513 0.44375205
## 27 42101004700     622     2540    44080 0.973622047 0.0031496063 0.41141732
## 28 42101004800     638      500    30455 1.000000000 0.0000000000 0.42000000
## 29 42101008301     472     4420    24800 0.020361991 0.0029411765 0.39321267
## 30 42101008302     425     4535    24567 0.005071665 0.0000000000 0.40044101
## 31 42101008400     414     4798    26341 0.015006253 0.0062526053 0.37265527
## 32 42101008500     381     6539    22077 0.028903502 0.0051995718 0.40266096
## 33 42101008600     443     6224    19612 0.154723650 0.0215295630 0.45983290
## 34 42101008700     507     6733    21131 0.457151344 0.0957968216 0.50378732
## 35 42101008800     590     8307     9320 0.584326472 0.1001564945 0.25954015
## 36 42101008900     538     3343     6311 0.561471732 0.1011067903 0.15465151
## 37 42101009000     571     5335    13792 0.621368322 0.0391752577 0.29090909
## 38 42101009100     350     2737    17500 0.299963464 0.0226525393 0.51589331
## 39 42101009200     307     3150    20083 0.026666667 0.0111111111 0.38634921
## 40 42101009300     377     4609    21503 0.002603602 0.0000000000 0.40203949
## 41 42101009400     285     3871    17743 0.009299923 0.0000000000 0.42934642
## 42 42101009500     368     3588    21058 0.000000000 0.0027870680 0.38099220
## 43 42101009600     404     4504    23171 0.017984014 0.0000000000 0.37655417
## 44 42101010000     421     4256    31415 0.336701128 0.0063439850 0.37946429
## 45 42101010200     331     3240    20634 0.008950617 0.0024691358 0.41172840
## 46 42101010400     374     3498    21772 0.010863350 0.0022870212 0.36678102
## 47 42101012600     916      801    31838 0.485642946 0.0399500624 0.40823970
## 48 42101012700     223      689     7500 0.076923077 0.0000000000 0.33671988
## 49 42101012800     841      231    39643 0.831168831 0.0303030303 0.64935065
## 50 42101012900     586      526    41000 0.922053232 0.0000000000 0.54372624
## 51 42101013000     573     1003    35625 0.417746760 0.0368893320 0.50149551
## 52 42101013200     177     2190    14375 0.051598174 0.0000000000 0.31735160
## 53 42101013300     492     2217    29352 0.274695535 0.0211998196 0.46098331
## 54 42101014000     370     2855    13458 0.042732049 0.0122591944 0.41015762
## 55 42101014100     210     2366    12165 0.094674556 0.0000000000 0.41758242
## 56 42101014200     565     2091    30862 0.608321377 0.0162601626 0.45911047
## 57 42101014400     323     3217    23720 0.393534349 0.0027976376 0.35685421
## 58 42101014600     384     3231    26295 0.194986072 0.0000000000 0.30238316
## 59 42101014700     306     2449    16411 0.008166599 0.0000000000 0.31073908
## 60 42101015300     353     3382    14826 0.068598462 0.0165582496 0.34920166
## 61 42101015400     489     2316    31923 0.410621762 0.0000000000 0.07340242
## 62 42101015600     326     1989    14524 0.227249874 0.0045248869 0.29562594
## 63 42101015700     339     2759    22478 0.355201160 0.0050743023 0.32584270
## 64 42101015800     387     5854    31862 0.953877690 0.0015374103 0.38503587
## 65 42101016100     349     5435    19098 0.605887764 0.0000000000 0.32750690
## 66 42101016200     337     2442    13833 0.223177723 0.0000000000 0.32186732
## 67 42101016300     341     3776    15865 0.181938559 0.0052966102 0.28257415
## 68 42101016500     330     2722    16964 0.011756062 0.0022042616 0.33357825
## 69 42101016600     353     1630    14250 0.025766871 0.0055214724 0.39693252
## 70 42101016700     363     6947    16367 0.002303152 0.0061897222 0.37339859
## 71 42101017300     427     3126    21731 0.008637236 0.0000000000 0.36852207
## 72 42101017400     339     2667    15050 0.024371954 0.0029996250 0.34833146
## 73 42101017601     325     5863    11909 0.240661777 0.0010233669 0.30291660
## 74 42101017700     370     9159    15687 0.313789715 0.0005459111 0.28976963
## 75 42101017800     389     6340    14793 0.502996845 0.0037854890 0.33280757
## 76 42101018800     385     7257    22489 0.686371779 0.0000000000 0.34463277
## 77 42101018900     412      942    24032 0.580679406 0.0084925690 0.33757962
## 78 42101019000     427     6869    24650 0.591789198 0.0000000000 0.32581162
## 79 42101019200     390     7519    19375 0.408032983 0.0014629605 0.30110387
## 80 42101019700     422     7139    19877 0.186300602 0.0016809077 0.29079703
## 81 42101020000     378     1756    22786 0.185649203 0.0182232346 0.30410023
## 82 42101020100     349     7732    20436 0.018882566 0.0010346611 0.40597517
## 83 42101020300     371     3147    23277 0.032729584 0.0019065777 0.34413727
## 84 42101020400     386     3398    29814 0.016774573 0.0020600353 0.34873455
## 85 42101027500     472     4823    33953 0.226622434 0.0008293593 0.32821895
## 86 42101027600     463     3932    31316 0.033825025 0.0073753815 0.39267548
## 87 42101027800     426     4867    25690 0.034929114 0.0000000000 0.41956030
## 88 42101028000     485     4819    28281 0.021996265 0.0076779415 0.35692052
## 89 42101028100     401     4051    36129 0.057516663 0.0108615157 0.32189583
## 90 42101028200     457     5448    29142 0.137114537 0.0014684288 0.30231278
## 91 42101028300     425     7270    24725 0.048693260 0.0011004127 0.33301238
## 92 42101029300     375     2898    19205 0.652173913 0.0027605245 0.34817115
## 93 42101029400     386     3232    21288 0.368811881 0.0021658416 0.34096535
## 94 42101030000     388     7150    23876 0.322517483 0.0036363636 0.35356643
## 95 42101030100     427     6050    32224 0.505289256 0.0019834711 0.36082645
## 96 42101030200     411     6694    39072 0.695100090 0.0043322378 0.36241410
## 97 42101036600    1505      618    87027 0.953074434 0.0161812298 0.61650485
##    year                       geometry
## 1  2000 MULTIPOLYGON (((2697812 -65...
## 2  2000 MULTIPOLYGON (((2696038 -65...
## 3  2000 MULTIPOLYGON (((2690929 -64...
## 4  2000 MULTIPOLYGON (((2693009 -66...
## 5  2000 MULTIPOLYGON (((2697206 -67...
## 6  2000 MULTIPOLYGON (((2696968 -68...
## 7  2000 MULTIPOLYGON (((2693753 -67...
## 8  2000 MULTIPOLYGON (((2693685 -68...
## 9  2000 MULTIPOLYGON (((2696756 -68...
## 10 2000 MULTIPOLYGON (((2699725 -68...
## 11 2000 MULTIPOLYGON (((2696783 -69...
## 12 2000 MULTIPOLYGON (((2693328 -70...
## 13 2000 MULTIPOLYGON (((2696596 -70...
## 14 2000 MULTIPOLYGON (((2696927 -71...
## 15 2000 MULTIPOLYGON (((2693190 -71...
## 16 2000 MULTIPOLYGON (((2693025 -72...
## 17 2000 MULTIPOLYGON (((2695666 -73...
## 18 2000 MULTIPOLYGON (((2696812 -72...
## 19 2000 MULTIPOLYGON (((2696561 -74...
## 20 2000 MULTIPOLYGON (((2695096 -74...
## 21 2000 MULTIPOLYGON (((2692869 -74...
## 22 2000 MULTIPOLYGON (((2692473 -76...
## 23 2000 MULTIPOLYGON (((2692147 -78...
## 24 2000 MULTIPOLYGON (((2694362 -76...
## 25 2000 MULTIPOLYGON (((2694135 -78...
## 26 2000 MULTIPOLYGON (((2693880 -80...
## 27 2000 MULTIPOLYGON (((2690184 -81...
## 28 2000 MULTIPOLYGON (((2693677 -82...
## 29 2000 MULTIPOLYGON (((2670305 -64...
## 30 2000 MULTIPOLYGON (((2672050 -64...
## 31 2000 MULTIPOLYGON (((2673150 -64...
## 32 2000 MULTIPOLYGON (((2675430 -64...
## 33 2000 MULTIPOLYGON (((2678189 -65...
## 34 2000 MULTIPOLYGON (((2682357 -65...
## 35 2000 MULTIPOLYGON (((2682978 -65...
## 36 2000 MULTIPOLYGON (((2685201 -66...
## 37 2000 MULTIPOLYGON (((2687003 -63...
## 38 2000 MULTIPOLYGON (((2685048 -63...
## 39 2000 MULTIPOLYGON (((2680537 -64...
## 40 2000 MULTIPOLYGON (((2677296 -63...
## 41 2000 MULTIPOLYGON (((2675716 -63...
## 42 2000 MULTIPOLYGON (((2673363 -61...
## 43 2000 MULTIPOLYGON (((2672293 -62...
## 44 2000 MULTIPOLYGON (((2669883 -59...
## 45 2000 MULTIPOLYGON (((2675291 -61...
## 46 2000 MULTIPOLYGON (((2678548 -61...
## 47 2000 MULTIPOLYGON (((2696344 -63...
## 48 2000 MULTIPOLYGON (((2698113 -63...
## 49 2000 MULTIPOLYGON (((2699439 -63...
## 50 2000 MULTIPOLYGON (((2700853 -63...
## 51 2000 MULTIPOLYGON (((2699624 -62...
## 52 2000 MULTIPOLYGON (((2696642 -61...
## 53 2000 MULTIPOLYGON (((2694718 -60...
## 54 2000 MULTIPOLYGON (((2694993 -58...
## 55 2000 MULTIPOLYGON (((2695544 -59...
## 56 2000 MULTIPOLYGON (((2701823 -61...
## 57 2000 MULTIPOLYGON (((2700270 -58...
## 58 2000 MULTIPOLYGON (((2696717 -57...
## 59 2000 MULTIPOLYGON (((2695197 -57...
## 60 2000 MULTIPOLYGON (((2695468 -54...
## 61 2000 MULTIPOLYGON (((2696989 -55...
## 62 2000 MULTIPOLYGON (((2700636 -56...
## 63 2000 MULTIPOLYGON (((2702130 -56...
## 64 2000 MULTIPOLYGON (((2702836 -57...
## 65 2000 MULTIPOLYGON (((2702305 -54...
## 66 2000 MULTIPOLYGON (((2700153 -54...
## 67 2000 MULTIPOLYGON (((2702563 -53...
## 68 2000 MULTIPOLYGON (((2697775 -52...
## 69 2000 MULTIPOLYGON (((2696508 -52...
## 70 2000 MULTIPOLYGON (((2695889 -52...
## 71 2000 MULTIPOLYGON (((2694518 -48...
## 72 2000 MULTIPOLYGON (((2697775 -52...
## 73 2000 MULTIPOLYGON (((2699943 -52...
## 74 2000 MULTIPOLYGON (((2704118 -49...
## 75 2000 MULTIPOLYGON (((2704108 -53...
## 76 2000 MULTIPOLYGON (((2709922 -48...
## 77 2000 MULTIPOLYGON (((2710580 -48...
## 78 2000 MULTIPOLYGON (((2711724 -43...
## 79 2000 MULTIPOLYGON (((2707100 -48...
## 80 2000 MULTIPOLYGON (((2699239 -42...
## 81 2000 MULTIPOLYGON (((2696092 -50...
## 82 2000 MULTIPOLYGON (((2695815 -44...
## 83 2000 MULTIPOLYGON (((2697659 -42...
## 84 2000 MULTIPOLYGON (((2697239 -41...
## 85 2000 MULTIPOLYGON (((2702629 -32...
## 86 2000 MULTIPOLYGON (((2699591 -32...
## 87 2000 MULTIPOLYGON (((2697819 -32...
## 88 2000 MULTIPOLYGON (((2696067 -38...
## 89 2000 MULTIPOLYGON (((2697957 -36...
## 90 2000 MULTIPOLYGON (((2697957 -36...
## 91 2000 MULTIPOLYGON (((2699709 -38...
## 92 2000 MULTIPOLYGON (((2711963 -42...
## 93 2000 MULTIPOLYGON (((2713128 -47...
## 94 2000 MULTIPOLYGON (((2716617 -41...
## 95 2000 MULTIPOLYGON (((2713405 -39...
## 96 2000 MULTIPOLYGON (((2714957 -37...
## 97 2000 MULTIPOLYGON (((2700290 -66...
print(selectCentroids)
## Simple feature collection with 97 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2667829 ymin: -84785.82 xmax: 2719496 ymax: -32044.96
## Projected CRS: NAD_1983_StatePlane_Pennsylvania_North_FIPS_3701_Feet
## First 10 features:
##    TotalPop                       geometry      Selection_Type
## 1      2576 MULTIPOLYGON (((2697812 -65... Select by Centroids
## 2      1355 MULTIPOLYGON (((2696038 -65... Select by Centroids
## 3      2577 MULTIPOLYGON (((2690929 -64... Select by Centroids
## 4      4387 MULTIPOLYGON (((2693009 -66... Select by Centroids
## 5      1071 MULTIPOLYGON (((2697206 -67... Select by Centroids
## 6      1384 MULTIPOLYGON (((2696968 -68... Select by Centroids
## 7      2550 MULTIPOLYGON (((2693753 -67... Select by Centroids
## 8      8461 MULTIPOLYGON (((2693685 -68... Select by Centroids
## 9      4969 MULTIPOLYGON (((2696756 -68... Select by Centroids
## 10     5808 MULTIPOLYGON (((2699725 -68... Select by Centroids

now the textbook challenges us to create a side by side maps of the various selection types using the facet_wrap layer tool from earlier

all_types_tracts00 <-
    rbind(clip,selection,selectCentroids)

imma set the size of the output. wasnt happy with the default

ggplot() + #creates the plot
  geom_sf(data=st_union(tracts00)) + #creates our city outline
     geom_sf(data = all_types_tracts00, aes(fill = q5(TotalPop))) +
  scale_fill_manual(values = palette5,
                    labels = qBr(all_types_tracts00, "TotalPop"),
                    name = "Popluation\n(Quintile Breaks)") +
  labs(title = "Total Population", subtitle = "Philadelphia; 2000") +
facet_wrap(~Selection_Type) +
  mapTheme()

the below code redoes our centroid approach and labels those tracts whose centroid falls into the buffer as transit oriented development. It label those tracts that do not fall into the buffer as non TOD.

allTracts.group <- 
  rbind( #since left join takes the first argument as the total number of rows, we need to bind the two subsets of the full datset on to of each other.
    st_centroid(allTracts)[buffer,] %>% #selects the tracts whose centroids fall in the buffer
      st_drop_geometry() %>% # drops the geometry to allow easier manipulation as a dataframe or maybe you just cant have two geometries per sf 
      left_join(allTracts) %>% #joins the missing data in our centroid tract data to the data from allTracts. this command outomatically matches columns to prevent duplication
      st_sf() %>% #converts that data to an sf
      mutate(TOD = "TOD"), #creates the TOD column and labels all the current data as TOD, not the comma. everything this line and above is the first argument of rbind
    st_centroid(allTracts)[buffer, op = st_disjoint] %>% # the disjoint option basically inverses the command so that we select all tracts that are not in buffer
      st_drop_geometry() %>%
      left_join(allTracts) %>%
      st_sf() %>%
      mutate(TOD = "Non-TOD")) %>% #labels every tract not in the buffer as non TOD
  mutate(MedRent.inf = ifelse(year == "2000", MedRent * 1.42, MedRent)) # multiplies all the year 2000 rents by inflation
## Warning: st_centroid assumes attributes are constant over geometries
## Joining with `by = join_by(GEOID, MedRent, TotalPop, MedHHInc, pctWhite,
## pctBachelors, pctPoverty, year)`
## Warning: st_centroid assumes attributes are constant over geometries
## Joining with `by = join_by(GEOID, MedRent, TotalPop, MedHHInc, pctWhite,
## pctBachelors, pctPoverty, year)`

now the textbook is giving us an excercise to recreate a map

ggplot() + #creates the plot
  geom_sf(data=st_union(allTracts.group)) + #creates our city outline
     geom_sf(data = allTracts.group, aes(fill = q5(MedRent.inf)), colour = NA) + # creates of visualization of the rents by tract
  scale_fill_manual(values = palette5, # choosing fill colors and other aesthetic attributes
                    labels = qBr(allTracts.group, "MedRent"),
                    name = "Rent\n(Quintile Breaks)") +
    geom_sf(data = buffer, fill = "grey", color = "red", alpha = 0.3) + # creates the buffer layer
  labs(title = "Median Rent", subtitle = "Real dollars: red borders denote buffer areas around subway stops") +
facet_wrap(~year) +
  mapTheme()

now that we have created our visual aid, we wanna do some comparisons in other formats including tables

allTracts.Summary <- 
  st_drop_geometry(allTracts.group) %>%
    group_by(year, TOD) %>%
    summarize(Rent = mean(MedRent, na.rm = T), #the summarize function creates one row for each combination of grouping variable
              Population = mean(TotalPop, na.rm = T), # we are manually specifiyinging our rows and calculations to be done in each row
              Percent_White = mean(pctWhite, na.rm = T),
              Percent_Bach = mean(pctBachelors, na.rm = T),
              Percent_Poverty = mean(pctPoverty, na.rm = T))
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
kable(allTracts.Summary) %>%
  kable_styling() %>%
  footnote(general_title = "\n",
           general = "Table 1.2")
year TOD Rent Population Percent_White Percent_Bach Percent_Poverty
2000 Non-TOD 470.5458 3966.789 0.4695256 0.0096146 0.3735100
2000 TOD 469.8247 4030.742 0.3848745 0.0161826 0.4031254
2017 Non-TOD 821.1642 4073.547 0.4396967 0.0116228 0.2373258
2017 TOD 913.3750 3658.500 0.4803197 0.0288166 0.3080936

Table 1.2

now we are gonna take our data frame and inverse it to see the variables as the rows

allTracts.Summary %>%
  unite(year.TOD, year, TOD, sep = ": ", remove = T) %>% # creates a new column that unites year and TOD. Remove option removes the old columns I imagine
    print() %>%
  gather(Variable, Value, -year.TOD) %>% # converts our data to long format. year.TOD is our "grouping variable", "variable" is the key, and "value" is the value. in practice when using the gather function, the grouping variable column is the column that stays put in the dataframe. It's the variable that the other variables get organized around. the key variable is what we call the new column that is made up of a stack of all the other variable that are not the grouping variable. the - sign denotes that we are excluding a variable from being gathered.
     print() %>%
  mutate(Value = round(Value, 2)) %>%
  spread(year.TOD, Value) %>% # this inverts our table, spreading year.TOD into separate columns alongside a Variable column that now has unique values for each row. Value is specified as the column to appear as values in the matrix
  kable() %>%
    kable_styling() %>%
    footnote(general_title = "\n",
             general = "Table 1.3")
## # A tibble: 4 × 6
##   year.TOD       Rent Population Percent_White Percent_Bach Percent_Poverty
##   <chr>         <dbl>      <dbl>         <dbl>        <dbl>           <dbl>
## 1 2000: Non-TOD  471.      3967.         0.470      0.00961           0.374
## 2 2000: TOD      470.      4031.         0.385      0.0162            0.403
## 3 2017: Non-TOD  821.      4074.         0.440      0.0116            0.237
## 4 2017: TOD      913.      3658.         0.480      0.0288            0.308
## # A tibble: 20 × 3
##    year.TOD      Variable             Value
##    <chr>         <chr>                <dbl>
##  1 2000: Non-TOD Rent             471.     
##  2 2000: TOD     Rent             470.     
##  3 2017: Non-TOD Rent             821.     
##  4 2017: TOD     Rent             913.     
##  5 2000: Non-TOD Population      3967.     
##  6 2000: TOD     Population      4031.     
##  7 2017: Non-TOD Population      4074.     
##  8 2017: TOD     Population      3658.     
##  9 2000: Non-TOD Percent_White      0.470  
## 10 2000: TOD     Percent_White      0.385  
## 11 2017: Non-TOD Percent_White      0.440  
## 12 2017: TOD     Percent_White      0.480  
## 13 2000: Non-TOD Percent_Bach       0.00961
## 14 2000: TOD     Percent_Bach       0.0162 
## 15 2017: Non-TOD Percent_Bach       0.0116 
## 16 2017: TOD     Percent_Bach       0.0288 
## 17 2000: Non-TOD Percent_Poverty    0.374  
## 18 2000: TOD     Percent_Poverty    0.403  
## 19 2017: Non-TOD Percent_Poverty    0.237  
## 20 2017: TOD     Percent_Poverty    0.308
Variable 2000: Non-TOD 2000: TOD 2017: Non-TOD 2017: TOD
Percent_Bach 0.01 0.02 0.01 0.03
Percent_Poverty 0.37 0.40 0.24 0.31
Percent_White 0.47 0.38 0.44 0.48
Population 3966.79 4030.74 4073.55 3658.50
Rent 470.55 469.82 821.16 913.38

Table 1.3

now the textbook shows us how to display this data as bar plots

allTracts.Summary %>%
  gather(Variable, Value, -year, -TOD) %>% # gather our data leaving both year and TOD as grouping variables
  ggplot(aes(year, Value, fill = TOD)) + # graph year on the x axis and value on the y axis, and specifies that we are filling the bars based on the two TOD groups
    geom_bar(stat = "identity", position = "dodge") +
    facet_wrap(~Variable, scales = "free", ncol=5) + #specifies that we want separare plots for each of the variables
    scale_fill_manual(values = c("#bae4bc", "#0868ac")) +
    labs(title = "Indicator differences across time and space") +
    plotTheme() + theme(legend.position="bottom",
                        strip.text.x = element_text(size = 10)) # changes the size of the strip labels at the top of each plot.
## Warning: The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

now the textbook creates 3 submarkets. One for the center of the city and one for each subway line. we want to see if the central city has had a disproportionate increase in rental prices compared to the other transit oriented development areas. if so, we may conclude that the increase at the center of the city skewed or TOD data.

centerCity <-
  st_intersection( #creates a layer that only includes the intersecting buffers of the two subway lines
    st_buffer(filter(septaStops, Line == "El"), 2640) %>% st_union(), #we start from our union of points sf layer. so we first create a buffer for both separate lines
    st_buffer(filter(septaStops, Line == "Broad_St"), 2640) %>% st_union()) %>%
      st_sf() %>% #converts to sf layer. not sure why this is necessary since we never dropped geometry
      mutate(Submarket = "Center City") #creates a column in our sf called submarket that is labelled City Center.

el <-
  st_buffer(filter(septaStops, Line == "El"), 2640) %>% st_union() %>%
    st_sf() %>%
    st_difference(centerCity) %>% #erases any portion of the layer that overlaps with the city center
    mutate(Submarket = "El")
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries
broad.st <-
  st_buffer(filter(septaStops, Line == "Broad_St"), 2640) %>% st_union() %>%
    st_sf() %>%
    st_difference(centerCity) %>%
    mutate(Submarket = "Broad Street")
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries
threeMarkets <- rbind(el, broad.st, centerCity) #stacks the three separate sfs on top of each other

now we are going to assign all our tracts to three 4 categories (central city, el, non tod.. etc) based on what buffer each centroid falls in.

allTracts.threeMarkets <-
  st_join(st_centroid(allTracts), threeMarkets) %>% # st_join joins two separate sf's together. in this case it joins threemarkets to the left of alltracts based on which market buffer from threemarkets each centroid from alltracts falls into. the geomtry that is kept is the centroid from alltracts  
  st_drop_geometry() %>% # drops the centroid geometry
  left_join(allTracts) %>% #joins alltracts to the left of what we are left with from the above code matching columns automatically. essentially, we are just getting the geometries from allTracts
  mutate(Submarket = replace_na(Submarket, "Non-TOD")) %>% #replaces all the na's in the submarket column with "Non-TOD"
  st_sf() 
## Warning: st_centroid assumes attributes are constant over geometries
## Joining with `by = join_by(GEOID, MedRent, TotalPop, MedHHInc, pctWhite,
## pctBachelors, pctPoverty, year)`

It doesnt give us the code to graph the separate markets so let’s try to do it oursevles

ggplot() + #creates the plot
  geom_sf(data=st_union(allTracts.threeMarkets)) + #creates our city outline
     geom_sf(data = allTracts.threeMarkets, aes(fill = Submarket)) +
  mapTheme()